ABSTRACT



The present invention discloses a pattern matching system 
applicable for syllable recognition which includes a dictio-
nary means for storing a plurality of standard patterns each
representing a standard syllable by at least a syllable feature. 
The pattern matching system further includes a converting 
means for converting an input pattern representing an 
unknown syllable into a categorizing pattern for represent- 
ing the unknown syllable in the syllable features used for 
representing the standard syllables. The pattern matching 
system further includes a Bayesian categorizing means for 
matching the standard pattern representing the standard 
syllable and the categorizing pattern representing the 
unknown syllable for computing a Bayesian mis- 
categorization risk for each of the standard syllables, the 
Bayesian categorization means further including a compar- 
ing and identification means for selecting a standard syllable 
which has the least mis-categorization risk as an identified 
syllable for the input unknown syllable. 

14 Claims, 4 Drawing Sheets
APPARATUS AND METHOD FOR NORMALIZING AND CATEGORIZING LINEAR PREDICTION CODE VECTORS USING BAYESIAN CATEGORIZATION TECHNIQUE

This is a continuation of application Ser. No. 08/160,580, filed Dec. 1, 1993, now abandoned.

BACKGROUND OF THE INVENTION

1. Field of the Invention

This invention relates generally to an apparatus and method for speech recognition. More particularly, this invention relates to the apparatus and method for syllable waveform compression and accurate recognition by the use of simplified Bayesian techniques whereby the processing time for syllable recognition is shortened.

2. Description of the Prior Art

The non-linear dynamic characteristics of expansion and contraction and the sequential time-varying features of syllable pronunciations greatly complicate the task of automatic speech recognition. In order to accurately recognize the uttered speech, a computerized speech recognition system must first extract the linguistic information from the acoustic signal by determining and discarding the extra-linguistic data. The extra-linguistic data contained in the acoustic signals may include characteristic features of the speaker's identity, the speaker's physiological and psychological states, and the acoustic environment such as the surrounding noises. The speech recognition system must then normalize a sequence of feature vectors which is used to characterize the utterance now represented by the linguistic portion of the acoustic signals. These tasks are quite complex and would generally take a considerable amount of computer time to accomplish. Since these tasks must be performed on a real time basis for an automatic speech recognition system to be practically useful, the requirement of extra computer processing time may often limit the development of a real-time computerized speech recognition system.

There are on-going efforts to improve the capability of syllable recognition. Several techniques have been developed to perform the two major tasks of syllable recognition, namely feature extraction and utterance classification. Before the task of feature extraction is performed, the physical utterance in the form of speech waveforms is first measured, including measurements of energy, zero crossings, extrema count, formants and LPC coefficients. Using the LPC coefficients for representation of the speech utterances provides a robust, reliable and accurate method for estimating the parameters that characterize the linear, time-varying system which is used to approximate the nonlinear, time-varying characteristics of the speech waveforms. There are several methods used to perform the task of utterance classification. A few of these methods which have been practically used in automatic speech recognition systems are dynamic time warping (DTW) pattern-matching, vector quantization (VQ), and hidden Markov model (HMM). The DTW methodology provides nonlinear time-axis expansions or contractions of an input phoneme which is then matched with the phoneme or landmark positions of the template phonemes. Dynamic programming techniques are used for pattern matching in DTW, which has shown some successful results. However, since the dynamic programming techniques are very computationally intensive and require extraordinary computer processing time, this method is not practically useful for real-time application.

Therefore, there is still a need in the art of manufacturing the fiber optic device to provide an apparatus and method such that the manufacturing steps of fusing and stretching can be precisely measured and controlled to assure that high quality optical devices are consistently produced. For the purpose of enabling the mass production of these high quality optical devices, the apparatus and method must be reliable and simple to use such that the processing steps would not become too complicated and that the manufacturing cost can be maintained at a reasonable level.



SUMMARY OF THE PRESENT INVENTION

It is therefore an object of the present invention to provide an apparatus and method to overcome the aforementioned difficulties encountered in the prior art.

Specifically, it is an object of the present invention to provide an apparatus and method to improve the speed of syllable recognition by the use of more effective waveform compression and classification methodologies whereby real time syllable recognition becomes more achievable.

Another object of the present invention is to provide a syllable recognition system and method wherein the syllable utterance waveforms are compressed into feature vectors by employing techniques which are simple enough to save computation resources yet capable of generating feature vectors which characterize all major dynamic features of the syllable.

Another object of the present invention is to provide a syllable recognition system and method wherein the classification of the compressed feature vectors for syllable recognition is accomplished by the use of Bayesian techniques which are systematic and can be conveniently automated and optimized such that modern processing power can be easily applied for syllable recognition.

Briefly, in a preferred embodiment, the present invention discloses a pattern matching system applicable for syllable recognition which includes a dictionary means for storing a plurality of standard patterns each representing a standard syllable by at least a syllable feature. The pattern matching system further includes a converting means for converting an input pattern representing an unknown syllable into a categorizing pattern for representing the unknown syllable in the syllable features used for representing the standard syllables. The pattern matching system further includes a Bayesian categorizing means for matching the standard pattern representing the standard syllable and the categorizing pattern representing the unknown syllable for computing a Bayesian mis-categorization risk for each of the standard syllables, the Bayesian categorization means further including a comparing and identification means for selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable.

It is an advantage of the present invention that it provides an apparatus and method to improve the speed of syllable recognition by the use of more effective waveform compression and classification methodologies whereby real time syllable recognition becomes more achievable.

Another advantage of the present invention is that it provides a syllable recognition system and method wherein the syllable utterance waveforms are compressed into feature vectors by employing techniques which are simple enough to save computation resources yet capable of generating feature vectors which characterize all major dynamic features of the syllable.

Another advantage of the present invention is that it provides a syllable recognition system and method wherein the classification of the compressed feature vectors for syllable recognition is accomplished by the use of Bayesian techniques which are systematic and can be conveniently automated and optimized such that modern processing power can be easily applied for syllable recognition.

These and other objects and advantages of the present invention will no doubt become obvious to those of ordinary skill in the art after having read the following detailed description of the preferred embodiment which is illustrated in the various drawing figures.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a schematic diagram illustrating the system configuration of a speech recognition system according to the present invention;

FIG. 2 shows a plurality of waveforms representing the syllable utterance in analog form;

FIG. 3 is a flow-chart diagram showing the processing steps of a Bayesian classification and identification means for classifying and identifying an input syllable; and

FIG. 4 is a flow-chart diagram showing the processing steps performed by the speech recognition system of FIG. 1 for identifying an input syllable utterance.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENT

FIG. 1 shows a speech recognition system 10 which includes an audio receiver 12 for receiving a series of speech waveforms representing a syllable utterance. FIG. 2 shows a plurality of these waveforms, and the waveforms received by the audio receiver 12 are in continuous analog form. A digital converter 14 then converts these waveforms into a series of digital signals. A preprocessor 16 receives these digital signals from the digital converter 14 to compute a set of linear predictive coding (LPC) coefficients and then transforms these coefficients into a corresponding set of LPC cepstra. (The details of these computations will be described below.) This set of LPC cepstra is inputted to a speech processor 20 and/or a database 22. The tasks performed by the speech processor 20, which will be described below, include the compression of the cepstra to extract dynamic features of the syllable utterance according to the compression methodologies of the present invention to be described below, and then using these compressed cepstra to identify the syllable by classification. The speech recognition system 10 further includes a user interface means 18 to allow a user of the system 10 to control the system operation and to provide user input as data or commands to the speech recognition system 10.

A pattern matching system 10 applicable for syllable recognition is disclosed in this invention which comprises a dictionary means included in the database 22 for storing a plurality of standard patterns each representing a standard syllable by at least a syllable feature. The pattern matching system also includes a converting means included in the preprocessor 16 for converting an input pattern representing an unknown syllable (one example is shown in FIG. 2) into a categorizing pattern for representing the unknown syllable in the syllable features used for representing the standard syllables. The pattern matching system 10 further includes a Bayesian categorizing means included in the speech processor 20 for matching the standard pattern representing the standard syllable and the categorizing pattern representing the unknown syllable for computing a Bayesian mis-categorization risk for each of the standard syllables. The method of the computation will be discussed below. The Bayesian categorization means included in the speech processor further includes a comparing and identification means for selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable.

A method for matching and categorizing an input pattern of waveform applicable for syllable recognition is also disclosed in the present invention. The method comprises the steps of: (a) storing in a dictionary means, included in the database 22, a plurality of standard patterns each representing a standard syllable by at least a syllable feature; (b) converting the input pattern of waveform representing an unknown syllable into a categorizing pattern for representing the unknown syllable by the syllable features used for representing the standard syllables; (c) matching the standard pattern representing the standard syllable and the categorizing pattern representing the unknown syllable by utilizing a Bayesian categorizing means, included in the speech processor 20, for computing a Bayesian mis-categorization risk for each of the standard syllables; and (d) comparing and selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable.

In receiving the digitized waveform data, the preprocessor 16 first assumes that this series of sampled speech utterance in digital form as measured can be represented as s(n), and that each sampled waveform s(n) can be linearly predicted from the past p samples of s(n). Then, a linear approximation of s(n), i.e., s'(n), is formulated as:

$$ s'(n) = \sum_{k=1}^{p} a(k)\, s(n-k) \qquad (1) $$

where the coefficients a(k), k=1,2,3,...,p are generally referred to as the linear predictive coding (LPC) coefficients and can be solved by the least square method. Let E be the squared difference between s(n) and s'(n) over N samples of s(n), then E may be represented as:

$$ E = \sum_{n=0}^{N} \left[ s(n) - s'(n) \right]^2 \qquad (2) $$

The least square method can be used to obtain the values of the LPC coefficients, i.e., a(k), by minimizing the value of E. Various techniques have been developed; among them, Durbin's recursive procedure (see J. Makhoul, "Linear Prediction: A Tutorial Review," Proceedings of the IEEE, Vol. 63, No. 4, 1975) is an efficient method for computing the LPC coefficients. The least square method is well known in the art and commercial software is readily available for computing the coefficients according to Equation (2). The details of the least square computations will therefore not be repeated here in this Patent Application.

In order to apply the Bayesian techniques for waveform classification for syllable recognition, the LPC coefficients must be transformed into LPC cepstra. The following recursive equations are used to transform the coefficients a(k) to the LPC cepstrum a'(i):

$$ a'(1) = -a(1) \qquad (3) $$

$$ a'(i) = -a(i) - \sum_{j=1}^{i-1} \left( 1 - \frac{j}{i} \right) a(j)\, a'(i-j), \quad 1 < i \le p \qquad (4) $$

$$ a'(i) = -\sum_{j=1}^{p} \left( 1 - \frac{j}{i} \right) a(j)\, a'(i-j), \quad p < i \qquad (5) $$

For more details concerning the detailed mathematical formulations and the advantages of using the LPC cepstrum please refer to 'Digital Speech Processing, Synthesis, and
Recognition' by Sadaoki Furui, Published by Marcel Dekker, Inc. (New York and Basel, 1989, P67.)

The waveform of each syllable is now represented by a plurality of vectors wherein each vector comprises a plurality of cepstra. For example, each of the vectors for representing a mandarin syllable typically includes 16 LPC cepstra. The waveforms representing the utterance of a syllable by a speaker may vary from time to time, even for a single syllable by the same speaker, depending on the duration of the utterance and various other factors of measurement for each specific utterance. Consequently, the LPC cepstrum vectors used for representing a syllable waveform often vary over time in a dynamically non-linear manner. For the purpose of syllable recognition, the cepstrum vectors used for representing each single syllable must be normalized and 'standardized' by a single set of vectors each of which includes a single set of cepstra representing the unique features of the syllable. Then, when a syllable is uttered by a speaker, the waveform for that specific syllable must be rapidly collected and converted to LPC cepstrum vectors. These cepstrum vectors must then be normalized and compared with this set of standardized cepstrum vectors to determine a best category for this uttered syllable in order to perform the task of speech recognition.

Because of the nonlinear time-varying characteristics of the syllable waveforms and the associated cepstrum vectors, special techniques are used in this invention to perform the normalization. In order to distinguish and then identify each syllable, it should be noted that an utterance of a syllable may be divided into two basic parts, i.e., a stable part and a feature part. The feature part comprises wave patterns of peaks and valleys representing the unique characteristics of the syllable utterance, and the stable parts represent the flat waveform portions between two wave patterns of the feature part. In addition to the shape of the waveform for a syllable, the duration of utterance may have nonlinear expansion or contraction wherein the stable parts have different lengths between two wave patterns. In order to accurately identify the syllable utterance, a compression process must first be performed to remove the stable flat portions of a syllable waveform in order to extract only the feature part for syllable identification. Therefore, the normalization process must also comprise a compression step before the task of feature extraction can be carried out.

In order to resolve the limitations experienced by the prior art, this invention utilizes three processing techniques to expedite the compression operation such that speech recognition can be practically performed on a real time basis. These three processing techniques are described below.

The first processing technique is to perform a waveform LPC cepstra compression on the basis of the absolute value of the LPC vectors. The k-th frame of a speech waveform is represented by a LPC vector, i.e., y(k) where k=1,2,3,...,n, and each of these vectors has p components, i.e., (y(k)_1, y(k)_2, ..., y(k)_p). The number of frames, i.e., n, depends on the length of the speech waveform. Let

$$ S = \sum_{k=1}^{n} \sum_{i=1}^{p} \left| y(k)_i \right| \qquad (6) $$

be the total sum of the absolute values of the elements of the LPC vectors. The total length of the speech waveform is then segmented into ten sections such that each section has the absolute value of S/10, whereby the parts of the speech waveform with large absolute values of the LPC elements are divided into more segments. The average value of the LPC cepstra is then calculated for each segment as a segment-average which is used as a first feature characteristic for compression.

The second method of compression is by deleting the stable portion of the LPC vectors. Let the difference of two consecutive LPC vectors be denoted as:

$$ D(k) = \sum_{i=1}^{p} \left| y(k)_i - y(k-1)_i \right| \qquad (7) $$

for k=2,3,4,...,n, and the LPC vector y(k) is deleted if its difference from the previous vector y(k-1) is below a [...]. The sequence of the LPC vectors after the above deletion is divided into ten equal sections [...]. The average value of the LPC cepstra in each of these ten equal sections is used as a different-feature for that section.

A third method of compression is by first deleting the stable parts of the LPC vectors as in the second method, by performing a computation on the LPC vectors according to Equation (7) to obtain a new sequence of LPC vectors, y1(k), k=1,2,3,...,m. Then a compression according to the sum of the differences of two consecutive LPC vectors, |y1(k)_i - y1(k-1)_i|, is performed based on the computation:

$$ S1 = \sum_{k=2}^{m} \sum_{i=1}^{p} \left| y1(k)_i - y1(k-1)_i \right| \qquad (8) $$

The LPC vectors y1(k), k=1,2,3,...,m, are divided into ten sections such that the sum of the differences of two consecutive LPC vectors in each section is equal to S1/10. The average value of the LPC cepstra in each section is a sum-difference-feature of that section.

With the feature vectors for characterizing the waveforms of the syllables, a simplified Bayesian decision rule is utilized to distinguish and identify the syllables according to the categorizations obtained by the computations.

The speech processor 20 now receives the compressed LPC cepstra, i.e., a feature vector X = (x1, x2, ..., xk), which is the input feature vector of a speech syllable from the preprocessor 16. It is the task of the speech processor 20 to determine whether the input feature vector belongs to category C(i) wherein C(i) is one of the M categories, i.e., categories C(1), C(2), ..., C(M), and the data are stored in the database 22. In order to determine whether the input feature vector X belongs to a category C(i), the speech processor 20 employs a simplified Bayes decision rule. Explanation of the mathematical formulation is provided below.

Let f(x|C(i)) be the conditional normal density function of X given category C(i) and let the prior probability t be constant, i.e., each category has equal probability to occur. A simple loss function for a decision rule d is used wherein the loss function is one when a misclassification is made and zero when the decision rule d is correct. Let R(t,d) denote a risk function, i.e., the probability of misclassification of d, and let G(i), where i=1,2,3,...,M, be M regions separated by the decision rule d in a k-dimensional domain of X, i.e., d decides C(i) when X is contained in G(i). The risk function can be expressed as:

$$ R(t, d) = \sum_{i=1}^{M} t \int_{G(i)^c} f(x \mid C(i))\, dx \qquad (9) $$

where G(i)^c is the complement of G(i). Let D be the family of all decision rules that separate M categories. Let the minimum probability of misclassification be denoted by R(t):
$$ R(t) = \inf_{d \in D} R(t, d) \qquad (10) $$

A decision rule that satisfies (10) is called the Bayes decision rule with respect to the prior distribution t, which may be represented as:

$$ d(x) = C(i) \quad \text{if} \quad f(x \mid C_i) > f(x \mid C_j) \quad \text{for all } j \ne i \qquad (11) $$

In order to achieve higher speed computation, the density function f(x|Ci) is assumed to be normal and the components of the feature vector are assumed to be stochastically independent. The conditional density function is then approximately represented by a function as:

$$ f(x_1, \ldots, x_k \mid C_i) = \left[ \prod_{l=1}^{k} \frac{1}{\sqrt{2\pi}\, \sigma_{il}} \right] \exp\left( -\frac{1}{2} \sum_{l=1}^{k} \left( \frac{x_l - \mu_{il}}{\sigma_{il}} \right)^2 \right) \qquad (12) $$

where i=1,2,...,M and M is the number of syllables to be recognized. For the purpose of classification, the logarithmic values of f(x1,...,xk|Ci) are compared:

$$ L(C_i) = \sum_{l=1}^{k} \log \sigma_{il} + \frac{1}{2} \sum_{l=1}^{k} \left( \frac{x_l - \mu_{il}}{\sigma_{il}} \right)^2 \qquad (13) $$

The category which has the least L(Ci) is identified as the syllable to which the input feature vector X=(x1, x2, ..., xk) belongs.

FIG. 3 is a flow chart diagram showing the processing steps performed by a Bayesian classification means included in the speech processor 20 to classify the compressed LPC cepstra to identify the input utterance waveforms as one of the syllables. By assuming that the compressed LPC cepstra (x1, ..., xk) have a normal distribution, the Bayesian classification means 16-2 first computes the mean μ_il and variance σ²_il, where l=1,2,3,...,k (step 40), for each category i representing a standard syllable utterance and stores them in the database 22. Let (x1, ..., xk) be the compressed LPC cepstra for a new syllable utterance. The Bayesian classification means 16-2 then computes the logarithmic value L(Ci) by the use of Equation (13) (step 42) for each category Ci, where i=1,2,3,...,M. The Bayesian classification means then compares the values of L(Ci) to determine a category Cs which has the least value (step 44), i.e., L(Cs). The syllable represented by the category Cs is identified as the syllable of the input utterance. It should be noted that the data for each category representing a standard syllable utterance which are used in the above computations are stored in the database 22 (see FIG. 1).

FIG. 4 is a flow-chart diagram showing the processing steps performed by the speech recognition system 10 to accomplish the classification and identification of a syllable. The speech signal is inputted and received (step 50) as digitized speech signals after being digitized by the A/D converter. A LPC coefficients and cepstra computation is then performed (step 55) according to Equations (1) to (5). The LPC cepstra vectors are then compressed by a compression means 16-1 of the speech processor 16 by the use of Equations (6) to (8). The compressed LPC cepstra vectors are then categorized (step 65) by the Bayesian classification means 16-2 by the use of Equation (13). A syllable is identified (step 70) which is the category identified by the Bayesian classification means 16-2 that has the least L(Ci) value. A variety of post identification processes may be performed (step 75) before the data is added to the data base (step 80) which may be used as part of the reference pattern for the data base (step 64) for future identification of new syllable input.

A pattern matching system 10 for syllable recognition is thus disclosed in the present invention. The pattern matching system 10 comprises a receiver 12 for receiving an incoming syllable utterance of either a standard syllable or an unknown syllable in the form of wavefunctions. The pattern matching system 10 further includes an analog to digital conversion means 14 for converting the wavefunctions to a plurality of digital data representing the wavefunctions. The pattern matching system further includes a linear predictive coding (LPC) means included in the preprocessor 16 for converting the digital data representing the wavefunction into a LPC cepstra vector. The pattern matching system further includes a speech processor 20 which includes a compression means for compressing each LPC cepstra vector into a compressed cepstra vector. The pattern matching system further has a database 22 which includes a dictionary means for storing a plurality of standard compressed cepstra vectors each representing a standard syllable. The speech processor 20 further includes a Bayesian categorizing means for matching each of the standard compressed cepstra vectors with the compressed cepstra vector of the unknown syllable for computing a Bayesian mis-categorization risk for each of the standard syllables. The Bayesian categorization means further includes a comparing and identification means for selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable. The pattern matching system 10 further includes a user interface means 18 to allow a user of the pattern matching system 10 to input data and commands for controlling the operation of the matching system. In a preferred embodiment, the linear predictive coding (LPC) means included in the pre-processor 16 employs Equations (1) to (5) for converting the digital data representing the input wavefunction into a LPC cepstra vector. The compression means included in the speech processor 20 employs one of the methods as described in Equations (6) to (8) for compressing the LPC cepstra vector into a compressed cepstra vector. And the Bayesian categorizing means included in the speech processor 20 employs Equation (13) for computing a Bayesian mis-categorization risk for each of the standard syllables and for selecting a standard syllable which has the least mis-categorization risk as an identified syllable for the input unknown syllable.

The speech recognition system 10 as disclosed in this invention thus resolves the major difficulties of the prior art by first utilizing an effective compression method to extract the essential dynamic features of the waveforms representing the syllable utterance. The speech recognition system 10 then employs the Bayesian method which can be conveniently programmed or implemented in hardware design to perform the categorization and identification process in a high speed automated manner. The speed of speech recognition is therefore substantially improved to allow real time speech recognition operation. The accuracy of speech recognition is also improved because the compression method captures all the major dynamic features of the waveforms representing the syllables while the Bayesian categorization method provides a systematic methodology to quantify the results of comparisons between different categories. By the use of the speech recognition system and method as disclosed in the present invention, the task of computerized speech recognition thus becomes more likely to be practically carried out in a real time fashion.
Also disclosed in the present invention is a method for
matching and categorizing an input pattern of waveform 
applicable for syllable recognition. The method comprises 
the steps of: (a) storing in a dictionary means a plurality of 
standard patterns each representing, by at least a standard 
syllable feature, a standard syllable (step 3* in FIG. 3); (b) 
converting the input pattern of waveform representing an 
unknown syllable into a categorizing pattern for represent- 
ing the unknown syllable by the syllable features used for 
representing the standard syllables (step 40); (c) matching 
the standard pattern representing the standard syllable and 
the categorizing pattern representing the unknown syllable 
by utilizing a Bayesian categorizing means for computing a 
Bayesian mis-categorization risk for each of the standard 
syllables (step 42); and (d) comparing and selecting a 
standard syllable which has the least mis-categorization risk 
as an identified syllable for the input unknown syllable (step 
44). 
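
As an illustration only, steps (a) to (d) above may be sketched in Python under the independent-normal model of Equation (13). The function names fit_category, log_risk and classify, the variance floor MIN_STD, and the two-component toy data labeled "ma" and "ba" are assumptions made for this sketch; none of them is taken from the disclosure.

```python
import math

MIN_STD = 1e-6  # assumed floor so that log(sigma) and the division stay defined


def fit_category(samples):
    """Step 40: per-component mean and standard deviation for one standard syllable."""
    n = len(samples)
    k = len(samples[0])
    means = [sum(v[l] for v in samples) / n for l in range(k)]
    stds = [
        max(math.sqrt(sum((v[l] - means[l]) ** 2 for v in samples) / n), MIN_STD)
        for l in range(k)
    ]
    return means, stds


def log_risk(x, means, stds):
    """Step 42: L(Ci) of Equation (13) for one category."""
    return sum(
        math.log(s) + 0.5 * ((xl - m) / s) ** 2
        for xl, m, s in zip(x, means, stds)
    )


def classify(x, categories):
    """Step 44: return the label whose L(Ci) is smallest."""
    return min(categories, key=lambda label: log_risk(x, *categories[label]))


# Hypothetical training data: three compressed vectors per standard syllable.
training = {
    "ma": [[1.0, 2.0], [1.2, 1.8], [0.8, 2.2]],
    "ba": [[4.0, 0.0], [4.2, 0.2], [3.8, -0.2]],
}
categories = {label: fit_category(vs) for label, vs in training.items()}
identified = classify([1.1, 2.1], categories)
```

Because the prior is taken as equal for every category, comparing L(Ci) alone is sufficient; the constant term in the logarithm of Equation (12) is the same for all categories and is omitted.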

In another preferred embodiment, the method for match- 
ing and categorizing an input pattern of waveform as 
described above wherein the step (a) further includes a step 
of converting each of the standard patterns into a standard 
LPC cepstra vector (step 55 in FIG. 4) prior to storing the
standard cepstra vectors in the dictionary means. And the 
step of converting the input pattern of waveform represent- 
ing an unknown syllable into a categorizing pattern is a step 
of converting the input pattern of waveform into a catego- 
rizing LPC cepstra vector (step 55) for matching with the 
standard LPC cepstra vectors in the step (c). 
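
The conversion of a digitized waveform into an LPC cepstra vector may be sketched as follows. This is a minimal sketch under stated assumptions, not the disclosed implementation: it assumes the autocorrelation method with Durbin's recursion as one way of minimizing E of Equation (2), and it assumes that the recursion of Equations (3) to (5) is written for the error-filter convention A(z) = 1 + sum of alpha(k) z^-k, so that the predictor coefficients a(k) of Equation (1) are negated before the cepstrum is computed. All function names are hypothetical.

```python
def autocorrelation(s, p):
    """Autocorrelation values R(0..p) of the sampled waveform s(n)."""
    n = len(s)
    return [sum(s[t] * s[t + k] for t in range(n - k)) for k in range(p + 1)]


def levinson_durbin(r, p):
    """Durbin's recursion: predictor coefficients a(1..p) of Equation (1)."""
    a = [0.0] * (p + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, p + 1):
        k = (r[i] - sum(a[j] * r[i - j] for j in range(1, i))) / err
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        err *= 1.0 - k * k
    return a[1:]


def lpc_to_cepstrum(alpha, n_cep):
    """Recursion of Equations (3) to (5); alpha uses the assumed error-filter sign."""
    p = len(alpha)
    c = [0.0] * (n_cep + 1)  # c[0] is unused
    for i in range(1, n_cep + 1):
        acc = sum(
            (1.0 - j / i) * alpha[j - 1] * c[i - j]
            for j in range(1, min(i - 1, p) + 1)
        )
        c[i] = -acc - (alpha[i - 1] if i <= p else 0.0)
    return c[1:]


# Toy signal s(n) = 0.5**n, which a first-order predictor fits with a(1) = 0.5.
signal = [0.5 ** n for n in range(50)]
a = levinson_durbin(autocorrelation(signal, 1), 1)
alpha = [-ak for ak in a]  # assumed sign change between Equation (1) and Equation (3)
cepstra = lpc_to_cepstrum(alpha, 3)
```

For a first-order model with a(1) = r, the cepstrum of 1/(1 - r z^-1) is r^n / n, which gives a simple check on the recursion for the toy signal above.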

In another preferred embodiment, the method for match- 
ing and categorizing an input pattern of waveform wherein 
the step (a) further includes a step of compressing each of 
the standard cepstra vectors into a standard compressed 
cepstra vector (step 60), by utilizing a compression means, 
prior to a step of storing the standard compressed cepstra 
vectors in the dictionary means. And, the step (c) further 
includes a step of compressing the categorizing LPC cepstra 
vector into a compressed categorizing LPC cepstra vector 
(step 60), by utilizing the compression means, prior to the 
step of matching with the standard compressed LPC cepstra 
vectors for computing a Bayesian mis-categorization risk for 
each of the standard syllables. 
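
The first processing technique, based on Equation (6), may be sketched as below. The sketch assumes that a frame is assigned to a section according to where the midpoint of its cumulative absolute value falls, and that a section receiving no frame reuses the preceding section's average (or zeros for a leading empty section); the disclosure does not specify these boundary cases. The name compress_by_magnitude is hypothetical.

```python
def compress_by_magnitude(frames, n_sections=10):
    """Average the cepstra over sections that each hold about S / n_sections."""
    p = len(frames[0])
    weights = [sum(abs(v) for v in y) for y in frames]
    total = sum(weights)  # S of Equation (6)
    if total == 0.0:
        return [[0.0] * p for _ in range(n_sections)]

    sections = [[] for _ in range(n_sections)]
    running = 0.0
    for y, w in zip(frames, weights):
        midpoint = running + w / 2.0  # assumed assignment rule
        index = min(int(n_sections * midpoint / total), n_sections - 1)
        sections[index].append(y)
        running += w

    features = []
    previous = [0.0] * p  # assumed fill value for empty sections
    for group in sections:
        if group:
            previous = [sum(y[i] for y in group) / len(group) for i in range(p)]
        features.append(list(previous))
    return features


# Ten low-magnitude frames followed by ten high-magnitude frames.
frames = [[1.0]] * 10 + [[3.0]] * 10
features = compress_by_magnitude(frames)
```

In this toy input the ten low-magnitude frames fall into sections 0 to 2, while the ten high-magnitude frames spread over sections 2 to 9, which matches the statement that parts of the waveform with large absolute values are divided into more segments.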

In another preferred embodiment, the method for match- 
ing and categorizing an input pattern of waveform further 
comprises a step of (e) providing a user interface means 18
to allow a user of the method for matching and categorizing
the input pattern of waveform to input data and commands 
for controlling the operation of the method. 

In yet another preferred embodiment, the method for 
matching and categorizing an input pattern of waveform 
further comprises the steps of (a1) providing a receiver 12
for receiving the input pattern of waveform; and (a2) uti- 
lizing an analog to digital conversion means 14 for convert- 
ing the input pattern of waveform to a plurality of digital 
data representing the input pattern of waveform wherein the 
steps (a1) and (a2) are performed prior to the performance
of the step (a) as illustrated in FIG. 1. 

Although the present invention has been described in 
terms of the presently preferred embodiment, it is to be 
understood that such disclosure is not to be interpreted as 
limiting. Various alterations and modifications will no 
doubt become apparent to those skilled in the art after 
reading the above disclosure. Accordingly, it is intended that 
the appended claims be interpreted as covering all alterations 
and modifications as fall within the true spirit and 
scope of the invention. 






I claim: 

1. A pattern matching system provided for performing a 
sequence of single syllables recognition comprising: 

a dictionary means for storing a plurality of standard 
patterns wherein each of said standard patterns representing 
a single standard syllable by a set of feature 
vectors C(1), C(2), C(3), . . . , and C(M) and M being 
a positive integer; 

a converting means for converting an input pattern representing 
single unknown syllable into a categorizing 
pattern for representing said single unknown syllable in 
a set of categorizing vectors X where X={x(1), x(2), 
x(3), . . . , x(k)} where k representing a positive integer; 
and 

a Bayesian-decision-rule categorizing means for computing 
a conditional normal density function f(x | Ci) for 
each of said feature vectors Ci, wherein said function 
f(x | Ci) having a normal distribution and said x(1), 
x(2), x(3), . . . and x(k) are stochastically independent; 
and 

said Bayesian-decision-rule categorizing means further 
employing functional parameters of said normal distribution 
for said normal density function f(x | Ci) to apply 
a Bayesian decision rule to deterministically identify 
said single unknown syllable with one of said standard 
single syllables. 

2. The pattern matching system of claim 1 wherein: 
said Bayesian-decision-rule categorizing means further 
computing said conditional normal density function 
f(x | Ci) as: 

where i=1, 2, 3, . . . , M and M is number of syllables to be 
recognized; 

said Bayesian-decision-rule categorizing means further 
computing logarithmic values of f(x | Ci) for 
comparing values of L(Ci) where: 

for deterministically identifying a category Ci which has the 
least L(Ci) as said standard syllable for identifying with said 
single unknown syllable. 

3. The pattern matching system of claim 2 further comprises: 

a linear predictive coding (LPC) means for converting 
each of said standard patterns into a LPC cepstra, and 
for converting said categorizing patterns into a categorizing 
LPC cepstra vector, and 
a compression means for compressing said LPC cepstra 
vector into a compressed standard pattern represented 
by said set of feature vectors C(1), C(2), C(3), . . . , C(M), 
for storing in said dictionary means, and for compressing 
said categorizing LPC cepstra vectors into a compressed 
categorizing patterns represented by said set of 
categorizing vectors X where X={x(1), x(2), x(3), . . . , 
x(k)} for storing in said dictionary means. 

4. The pattern matching system of claim 2 further comprises: 

a user interface means to allow a user of said pattern 
matching system to provide input data and commands 
for controlling the operation of said matching system. 



5. The pattern matching system of claim 4 further comprises: 

a receiver for receiving an incoming syllable utterance in 

the form of wavefunctions; and 
an analog to digital conversion means for converting said 

wavefunctions to a plurality of digital data representing 

said wavefunctions. 

6. A pattern matching system provided for performing a 
sequence of single syllables recognition comprising: 

a receiver for receiving an incoming syllable utterance of 
either a standard syllable or a single unknown syllable 
for syllable recognition in the form of wavefunctions; 

an analog to digital conversion means for converting said 
wavefunctions to a plurality of digital data representing 
said wavefunctions; 

a linear predictive coding (LPC) means for converting 
said digital data representing said input wavefunction 
into a LPC cepstra vector; 

a compression means for compressing said each LPC 
cepstra vector into a compressed cepstra vector wherein 
said unknown syllable is represented by a set of categorizing 
vectors X where X={x(1), x(2), x(3), . . . , x(k)}; 

a dictionary means for storing a plurality of standard 
compressed cepstra vectors each representing a standard 
single syllable by a set of feature vectors C(1), 
C(2), C(3), . . . , and C(M); 

a Bayesian-decision-rule categorizing means for computing 
a conditional normal density function f(x | Ci) for 
each of said feature vectors Ci, assuming that said 
function f(x | Ci) having a normal distribution and said 
x(1), x(2), x(3), . . . and x(k) are stochastically 
independent; 

said Bayesian-decision-rule categorizing means further 
employing functional parameters of said normal distribution 
for said normal density function f(x | Ci) to apply 
a Bayesian decision rule to deterministically identify 
said single unknown syllable with one of said standard 
syllables; and 

a user interface means to allow a user of said pattern 
matching system to provide input data and commands 
for controlling the operation of said matching system. 

7. The pattern matching system of claim 6 wherein: 
said compression means compressing said LPC cepstra 
vectors represented by Yk where Yk={y(k)1, y(k)2, 
y(k)3, . . . , y(k)p} and k=1, 2, 3, . . . , n, by deleting a stable 
portion of said vectors with a difference of two of said 
consecutive LPC cepstra vectors be denoted as: 

for k=2, 3, 4, . . . , n, and by deleting one of said LPC cepstra 
vectors when said D(k) is below a pre-designated threshold 
value. 

8. The pattern matching system of claim 7 wherein: 

said compression means compressing said LPC cepstra 
vectors represented by Y'k where Y'k={y'(k)1, y'(k)2, 
y'(k)3, . . . , y'(k)p} and k=1, 2, 3, . . . , m, according to 
a sum SI of absolute differences of two consecutive 
LPC cepstra vectors wherein: 

and said LPC cepstra vectors are divided into a plurality 
of sections with an average value of said LPC cepstra vectors 
in each section characterized by a sum-difference-feature 
of said section. 

9. The pattern matching system of claim 6 wherein: 

said compression means compressing said LPC cepstra 
vectors represented by Yk where Yk={y(k)1, y(k)2, 
y(k)3, . . . , y(k)p} and k=1, 2, 3, . . . , n, by normalizing 
each of said LPC cepstra vectors by applying a total 
sum of absolute values of said LPC cepstra vector S 
wherein: 

and by dividing a syllable into M sections wherein M is a 
positive integer and said LPC cepstra vectors for each 
section are normalized to an absolute value of S/M. 

10. A method for matching and categorizing an input 
pattern of waveform applicable for syllable recognition 
comprising the steps of: 

(a) storing in a dictionary means a plurality of standard 
patterns wherein each of said standard patterns representing 
a standard single syllable by a set of feature 
vectors C(1), C(2), C(3), . . . , and C(M) and M being 
a positive integer; 

(b) converting said input pattern of waveform representing 
an unknown single syllable into a categorizing 
pattern for representing said unknown syllable by a set 
of categorizing vectors X where X={x(1), x(2), x(3), 
. . . , x(k)} and k being a positive integer; 

(c) utilizing a Bayesian-decision-rule categorizing means 
for computing a conditional normal density function 
f(x | Ci) wherein said function f(x | Ci) having a normal 
distribution and said x(1), x(2), x(3), . . . and x(k) are 
stochastically independent; and 

(d) employing functional parameters of said normal distribution 
for said normal density function f(x | Ci) to 
apply a Bayesian decision rule to identify said input 
unknown single syllable with one of said standard 
single syllables. 

11. The method for matching and categorizing an input 
pattern of waveform as recited in claim 10 wherein: 

said step (a) further includes a step of converting each of 
said standard patterns into a standard LPC cepstra 
vector; and 

said step of converting said input pattern of waveform 
representing an unknown single syllable into a categorizing 
pattern is a step of converting said input pattern 
of waveform into a categorizing LPC cepstra vector. 

12. The method for matching and categorizing an input 
pattern of waveform as recited in claim 11 wherein: 

said step (a) further includes a step of compressing each 
of said standard cepstra vectors into a set of standard 
compressed cepstra vectors represented by said set of 
feature vectors C(1), C(2), C(3), . . . , and C(M), for 
storing in said dictionary means; and 
said step (c) further includes a step of compressing said 
categorizing LPC cepstra vector into a compressed 
categorizing LPC cepstra vector represented by said set of 
categorizing vectors X where X={x(1), x(2), x(3), . . . , 
x(k)}. 

13. The method for matching and categorizing an input 
pattern of waveform as recited in claim 12 further comprises 
a step of: 

(e) providing a user interface means to allow a user of 
said method for matching and categorizing said input 
pattern of waveform to input data and commands for 
controlling the operation of said method. 

14. The method for matching and categorizing an input 
pattern of waveform as recited in claim 13 further comprises 
the steps of: 
(a1) providing a receiver for receiving said input pattern 
of waveform; and 

(a2) utilizing an analog to digital conversion means for 
converting said input pattern of waveform to a plurality 
of digital data representing said input pattern of waveform 
wherein said steps (a1) and (a2) are performed 
prior to the performance of said step (a). 
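
The categorizing rule recited in claims 1, 2 and 10 can be sketched as follows. This is a minimal illustration, assuming that each standard syllable is stored as a per-feature mean and standard deviation, that the categorizing pattern is flattened into a single list of feature values, and that all syllables have equal prior probability. Under the recited assumptions of normality and independence, the negative logarithm of the density, with the constant term dropped, is the sum over features of ln(sigma) + ((x - mu) / sigma)^2 / 2. The names `risk`, `categorize` and `Model`, and the symbols mu and sigma, are hypothetical and do not appear in the claims.

```python
import math
from typing import Dict, Sequence, Tuple

# Hypothetical dictionary entry: (means, standard deviations), one pair per feature.
Model = Tuple[Sequence[float], Sequence[float]]


def risk(x: Sequence[float], model: Model) -> float:
    """Negative log of an independent normal density, constant term omitted."""
    means, stds = model
    return sum(
        math.log(s) + 0.5 * ((xj - m) / s) ** 2
        for xj, m, s in zip(x, means, stds)
    )


def categorize(x: Sequence[float], dictionary: Dict[str, Model]) -> str:
    """Return the label of the standard syllable with the least risk for x."""
    return min(dictionary, key=lambda label: risk(x, dictionary[label]))


if __name__ == "__main__":
    dictionary = {
        "ba": ([0.0, 0.0], [1.0, 1.0]),
        "da": ([3.0, 3.0], [1.0, 1.0]),
    }
    print(categorize([0.2, -0.1], dictionary))
```

Working with the logarithm rather than the density itself avoids numerical underflow when many features are multiplied together, and it leaves the ordering of the candidates unchanged.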
UNITED STATES PATENT AND TRADEMARK OFFICE 

CERTIFICATE OF CORRECTION 

PATENT NO. : 5,704,004 Page 1 of 2 

DATED : December 30, 1997 
INVENTOR(S) : Tze Fen LI et al. 

It is certified that error appears in the above-identified patent and that said Letters Patent is hereby 
corrected as shown below: 

Claim 1, column 10, line 13, "representing" should 
read -- represents --. 

Claim 6, column 11, line 44, "low" should read -- allow --; 
line 36, "independent;," should read -- independent; --. 

Claim 8, column 11, line 61, "daim 7" should read 
-- claim 7 --. 

Claim 10, column 12, line 25, "of:." should read 
-- of: --. 

Claim 12, column 12, line 65, "represent" should read 
-- represented --. 